Approximations for general bootstrap of empirical processes with an 
application to kernel-type density estimation 
Abstract

The purpose of this note is to provide an approximation for the generalized bootstrapped empirical process achieving the rate in Komlós et al. (1975). The proof is based on much the same arguments used in Horváth et al. (2000). As a consequence, we establish an approximation of the bootstrapped kernel-type density estimator.
Let $X_1, X_2, \ldots$ be a sequence of independent, identically distributed [i.i.d.] random variables with common distribution function $F(t) = P(X_1 \le t)$. The empirical distribution function of $X_1, \ldots, X_n$ is

\[ F_n(t) = \frac{1}{n}\sum_{i=1}^{n} \mathbb{1}\{X_i \le t\}, \qquad -\infty < t < \infty, \tag{1} \]



where $\mathbb{1}\{A\}$ stands for the indicator function of the event $A$. Given the sample $X_1, \ldots, X_n$, let $X_1^*, \ldots, X_m^*$ be conditionally independent random variables with common distribution function $F_n$. Let

\[ F_{m,n}(t) = \frac{1}{m}\sum_{i=1}^{m} \mathbb{1}\{X_i^* \le t\}, \qquad -\infty < t < \infty, \tag{2} \]



denote the classical Efron (or multinomial) bootstrap (see, e.g., Efron (1979) and Efron and Tibshirani (1993) for more details). Define the bootstrapped empirical process, $\alpha_{m,n}$, by

\[ \alpha_{m,n}(t) := \sqrt{m}\,\big(F_{m,n}(t) - F_n(t)\big), \qquad -\infty < t < \infty. \tag{3} \]
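As an illustration of definitions (1) to (3), the following is a minimal sketch of one Efron resample and the resulting bootstrapped empirical process. The helper names `ecdf` and `efron_bootstrap_process`, the Gaussian data and the choice $m = n = 200$ are assumptions made for the example only; they do not come from the paper.

```python
import math
import random
from bisect import bisect_right


def ecdf(sample):
    """Return the empirical distribution function t -> (1/n) * #{i : x_i <= t}."""
    xs = sorted(sample)
    n = len(xs)
    return lambda t: bisect_right(xs, t) / n


def efron_bootstrap_process(sample, m, rng):
    """Return t -> sqrt(m) * (F_{m,n}(t) - F_n(t)) for one Efron resample of size m."""
    f_n = ecdf(sample)
    # Drawing uniformly with replacement from the sample is sampling from F_n.
    resample = [rng.choice(sample) for _ in range(m)]
    f_mn = ecdf(resample)
    return lambda t: math.sqrt(m) * (f_mn(t) - f_n(t))


rng = random.Random(0)  # illustrative seed
data = [rng.gauss(0.0, 1.0) for _ in range(200)]  # illustrative data
alpha_mn = efron_bootstrap_process(data, m=200, rng=rng)

# The process is a step function that only changes at sample points,
# so its supremum over the real line is attained on the sample itself.
sup_abs = max(abs(alpha_mn(x)) for x in data)
```

Because every resampled value is one of the original observations, both distribution functions equal 1 at the largest observation, so the process vanishes there.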



Among many other things, Bickel and Freedman (1981) established weak convergence of the process in (3), which enabled them to deduce the asymptotic validity of the bootstrap method in forming confidence bounds for $F(\cdot)$. Shorack (1982) gave a simple proof of weak convergence of the process in (3) [see also Shorack and Wellner (1986), Section 23.1]. The Bickel and Freedman result for $\alpha_{m,n}$ has been subsequently generalized for empirical processes based on observations in $\mathbb{R}^d$, $d > 1$, as well as in very general sample spaces and for various set- and function-indexed random objects [see, for example, Beran (1984), Beran and Millar (1986), Beran et al. (1987), Gaenssler (1992), Lohse (1987)]. This line of research found its "final results" in the work of Giné and Zinn (1989, 1990) and Csörgő and Mason (1989).



*e-mail: salim.bouzebda@upmc.fr 
^e-mail: omar.eldakkak@gmail.com 






By now, the bootstrap is a widely used tool and, therefore, the properties of $\alpha_{m,n}(t)$ are of great interest in applied as well as in theoretical statistics. In fact, several procedures can be described in terms of the empirical process $\alpha_n(t)$, the limit distributions being functionals of $B(F(t))$, where $B$ is a Brownian bridge. Because the limits may depend on the unknown distribution $F(t)$, good approximations of these limiting distributions are needed, and this is where the bootstrap has proved to be a very effective tool. There is a huge literature on the application of the bootstrap methodology to nonparametric kernel density and regression estimation, among other statistical procedures, and it is not the purpose of this paper to survey this extensive literature. This being said, it is worth mentioning that the bootstrap as per Efron's original formulation (see Efron (1979)) presents some drawbacks. Namely, some observations may be used more than once while others are not sampled at all. To overcome this difficulty, a more general formulation of the bootstrap has been devised: the weighted (or smooth) bootstrap, which has also been shown to be computationally more efficient in several applications. For a survey of further results on the weighted bootstrap the reader is referred to Barbe and Bertail (1995). Exactly as for Efron's bootstrap, the question of rates of convergence is an important one (both in probability and in statistics) and has occupied a great number of authors (see Csörgő and Révész (1981), Komlós et al. (1975), Horváth et al. (2000) and the references therein).



In this note, we will consider a version of the Mason-Newton bootstrap (see Mason and Newton (1992), and the references therein). As will be clear, this approach to the bootstrap is very general and allows for a great deal of flexibility in applications. Let $(X_n)_{n \ge 1}$ be a sequence of i.i.d. random variables defined on a probability space $(\Omega, \mathcal{A}, \mathbb{P})$. We extend $(\Omega, \mathcal{A}, \mathbb{P})$ to obtain a probability space $(\Omega^{(\pi)}, \mathcal{A}^{(\pi)}, P)$. The latter will carry the independent sequences $(X_n)_{n \ge 1}$ and $(Z_n)_{n \ge 1}$ (defined below) and will be considered rich enough to allow the definition of another sequence $(B_n^*)$ of Brownian bridges, independent of all the preceding sequences. The possibility of such an extension is discussed in detail in the literature; the reader is referred, e.g., to Csörgő and Révész (1981), Komlós et al. (1975) and Berkes and Philipp (1977). In the sequel, whenever an almost sure property is stated, it will be tacitly assumed that it holds with respect to the probability measure $P$ defined on the extended space.

Define a sequence $(Z_n)_{n \ge 1}$ of i.i.d. replicae of a strictly positive random variable $Z$ with distribution function $G(\cdot)$, independent of the $X_n$'s. In the sequel, the following assumptions on the $Z_n$'s will prevail:

(A1) $E(Z) = 1$; $E(Z^2) = 2$ (or, equivalently, $\mathrm{Var}(Z) = 1$).

(A2) There exists an $\varepsilon > 0$ such that $E(e^{tZ}) < \infty$ for all $|t| \le \varepsilon$.
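To make the two assumptions concrete, here is a short worked check for one admissible choice of $Z$. The standard exponential distribution is used purely as an illustrative assumption; the paper does not single it out at this point.

```latex
% Worked check of (A1) and (A2) for Z ~ Exp(1), i.e. density e^{-z} on z > 0.
\begin{align*}
E(Z)      &= \int_0^\infty z\, e^{-z}\,dz = 1, \\
E(Z^2)    &= \int_0^\infty z^2 e^{-z}\,dz = 2,
            \qquad\text{so}\qquad \mathrm{Var}(Z) = 2 - 1^2 = 1, \\
E(e^{tZ}) &= \int_0^\infty e^{-(1-t)z}\,dz = \frac{1}{1-t} < \infty
            \qquad\text{for all } t < 1 .
\end{align*}
% Hence (A1) holds, and (A2) holds with any 0 < \varepsilon < 1.
```

Any strictly positive $Z$ with unit mean, unit variance and a moment generating function finite near the origin would serve equally well.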



For all $n \ge 1$, let $T_n = Z_1 + \cdots + Z_n$ and define the random weights

\[ W_{i;n} := \frac{Z_i}{T_n}, \qquad i = 1, \ldots, n. \tag{4} \]

The quantity

\[ F_n^*(t) = \sum_{i=1}^{n} W_{i;n}\,\mathbb{1}\{X_i \le t\}, \qquad -\infty < t < \infty, \tag{5} \]

will be called the generalized (or weighted) bootstrapped empirical distribution function. Analogously, recalling the empirical process based on $X_1, \ldots, X_n$,

\[ \alpha_n(t) = n^{1/2}\big(F_n(t) - F(t)\big), \qquad -\infty < t < \infty, \tag{6} \]

define the corresponding generalized (or weighted) bootstrapped empirical process by

\[ \alpha_n^*(t) = n^{1/2}\big(F_n^*(t) - F_n(t)\big), \qquad -\infty < t < \infty. \tag{7} \]
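The construction of the weights and of the weighted bootstrapped empirical process can be sketched as follows. The function name `weighted_bootstrap_process`, the uniform data and the choice of standard exponential multipliers (which have mean 1 and variance 1) are assumptions made for illustration, not part of the paper.

```python
import math
import random


def weighted_bootstrap_process(sample, z):
    """Return t -> sqrt(n) * (F*_n(t) - F_n(t)) with weights W_i = Z_i / T_n."""
    n = len(sample)
    t_n = sum(z)                          # T_n = Z_1 + ... + Z_n
    weights = [z_i / t_n for z_i in z]    # random weights, summing to 1

    def alpha_star(t):
        f_star = sum(w for w, x in zip(weights, sample) if x <= t)
        f_n = sum(1 for x in sample if x <= t) / n
        return math.sqrt(n) * (f_star - f_n)

    return alpha_star


rng = random.Random(1)  # illustrative seed
data = [rng.uniform(0.0, 1.0) for _ in range(100)]  # illustrative data
z = [rng.expovariate(1.0) for _ in data]  # assumed multipliers: Exp(1)
alpha_star = weighted_bootstrap_process(data, z)
```

Unlike Efron's resampling, every observation receives a strictly positive weight here, so no data point is dropped. With equal multipliers the weights reduce to $1/n$ and the process is identically zero.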



The system of weights defined in (4) appears in Mason and Newton (1992), p. 1617, where it is shown that it satisfies assumptions (W_I), (W_II) and (W_III) on p. 1612 of the same reference, so that all the results therein hold for the objects to be treated in this note. In particular, weak convergence of the process $\alpha_n^*$ to a Brownian bridge is proved. For more results concerning this version of the weighted bootstrapped empirical process, we refer the reader to Deheuvels and Derzko (2008). Note that, as a special case of the system of weights we are considering, one can obtain the one used for the Bayesian bootstrap (see Rubin (1981)).



In what follows, we obtain a KMT rate of convergence for this process in sup norm. More precisely, we consider deviations between the generalized bootstrapped empirical process $\{\alpha_n^*(t) : t \in \mathbb{R}\}$ and a sequence of approximating Brownian bridges $\{B_n^*(F(t)) : t \in \mathbb{R}\}$ on $\mathbb{R}$. Our main result goes as follows.

Theorem 1 Let assumptions (A1) and (A2) hold. Then, it is possible to define a sequence of Brownian bridges $\{B_n^*(y) : 0 \le y \le 1\}$ such that, for all $\varepsilon, \eta > 0$, there exists $N = N(\varepsilon, \eta)$ such that, for all $n > N$ and all $x > 0$,

\[ P\Big( \sup_{-\infty<t<\infty} \big|\alpha_n^*(t) - B_n^*(F(t))\big| > n^{-1/2}(K_1 \log n + x) \Big) \le K_2 \exp\Big( -\frac{K_3\, x}{(1+\varepsilon)^2} \Big) + \eta, \tag{8} \]

where $K_1$, $K_2$ and $K_3$ are positive universal constants.
The proof of Theorem 1 is given in Section 3.

Remark 1 Theorem 1 implies the following approximation of the weighted bootstrap:

\[ \sup_{-\infty<t<\infty} \big|\alpha_n^*(t) - B_n^*(F(t))\big| = O_P\Big( \frac{\log n}{n^{1/2}} \Big). \tag{9} \]



Remark 2 Theorem 1 turns out to be useful in obtaining confidence bands for the distribution function of the sample data. We formalize this idea as follows: for $0 < \alpha < 1$, one has

\[ \lim_{n\to\infty} P\Big( \sup_{-\infty<t<\infty} \sqrt{n}\,|F_n(t) - F(t)| \le c(\alpha) \Big) = P\Big( \sup_{-\infty<t<\infty} |B(F(t))| \le c(\alpha) \Big). \tag{10} \]

Note that $B(F(\cdot))$ is a zero-mean Gaussian process with covariance structure

\[ E\big(B(F(t))\,B(F(s))\big) = F(t \wedge s) - F(t)F(s), \]

where $t \wedge s := \min(t, s)$. In practice, $c(\alpha)$ cannot be computed, since the covariance structure of $B(F(t))$ depends on the unknown cdf $F$. Instead, suppose $(Z_1^{(1)}, \ldots, Z_n^{(1)}), \ldots, (Z_1^{(N)}, \ldots, Z_n^{(N)})$ are $N$ independent vectors of i.i.d. copies of $Z$, sampled independently of the $X_i$'s. Define the random variables

\[ Y_j := \sup_{-\infty<t<\infty} \big|\alpha_{n,j}^*(t)\big|, \qquad j = 1, \ldots, N, \]

where $\alpha_{n,j}^*$ denotes the generalized bootstrapped empirical process constructed with the sample $(Z_1^{(j)}, \ldots, Z_n^{(j)})$, $j = 1, \ldots, N$. Theorem 1 accounts for the use of the smallest $z > 0$ such that

\[ \frac{1}{N}\sum_{j=1}^{N} \mathbb{1}\{Y_j \le z\} \ge 1 - \alpha \tag{11} \]

as an estimator of $c(\alpha)$.
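The Monte Carlo recipe of Remark 2 can be sketched in a few lines. This is only an illustration under stated assumptions: the multipliers are taken to be standard exponential, the helper names `sup_weighted_process` and `bootstrap_critical_value` are hypothetical, and the supremum is computed on the sample points, where the step function $\alpha_n^*$ takes all its values.

```python
import math
import random


def sup_weighted_process(sample, z):
    """Return sup_t |sqrt(n) * (F*_n(t) - F_n(t))| for multipliers z."""
    n = len(sample)
    t_n = sum(z)
    order = sorted(range(n), key=lambda i: sample[i])
    cum_w, best = 0.0, 0.0
    for rank, i in enumerate(order, start=1):
        cum_w += z[i] / t_n
        # Evaluate only after the last element of a group of tied observations.
        if rank == n or sample[order[rank]] != sample[i]:
            best = max(best, abs(cum_w - rank / n))
    return math.sqrt(n) * best


def bootstrap_critical_value(sample, alpha, n_boot, rng):
    """Smallest bootstrap supremum z with (1/N) * #{j : Y_j <= z} >= 1 - alpha."""
    sups = sorted(
        sup_weighted_process(sample, [rng.expovariate(1.0) for _ in sample])
        for _ in range(n_boot)
    )
    k = max(1, math.ceil((1 - alpha) * n_boot))  # at least k suprema must lie below z
    return sups[k - 1]


rng = random.Random(2)  # illustrative seed
data = [rng.gauss(0.0, 1.0) for _ in range(200)]  # illustrative data
c_hat = bootstrap_critical_value(data, alpha=0.05, n_boot=200, rng=rng)
half_width = c_hat / math.sqrt(len(data))  # band: F_n(t) +/- half_width
```

The resulting band $F_n(t) \pm \hat c/\sqrt{n}$ is the natural reading of (10) with $c(\alpha)$ replaced by its bootstrap estimate.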



A direct consequence of Theorem 1 and Theorem 1.5 in Horváth et al. (2000) is the following approximation for $\alpha_n^*(\cdot)$ based on a Kiefer process.

Theorem 2 There is a Kiefer process $\{K(t; x) : 0 \le t \le 1;\ 0 \le x < \infty\}$ such that

\[ \max_{1 \le k \le n}\ \sup_{-\infty<t<\infty} \Big| \sum_{i=1}^{k} \big(W_{i;n} - 1/n\big)\,\mathbb{1}\{X_i \le t\} - K(F(t), k) \Big| = O_P\big( n^{1/4} (\log n)^{1/2} \big). \tag{12} \]






2 An application to kernel density estimation 

Let $X_1, \ldots, X_n$ be independent random replicae of a random variable $X \in \mathbb{R}$ with distribution function $F(\cdot)$. We assume that the distribution function $F(\cdot)$ has a density $f(\cdot)$ (with respect to the Lebesgue measure in $\mathbb{R}$). First of all, we introduce a kernel density estimator of $f(\cdot)$. To this end, let $K(\cdot)$ be a measurable function fulfilling the following conditions:

(K1) $K(\cdot)$ is of bounded variation and compactly supported on $\mathbb{R}$;
(K2) $K \ge 0$ and $\int K(u)\,du = 1$.



Now, define the Akaike-Parzen-Rosenblatt kernel density estimator of $f(\cdot)$ (see Akaike (1954), Parzen (1962) and Rosenblatt (1956)) as follows: for all $x \in \mathbb{R}$, estimate $f(x)$ by

\[ f_{n,h_n}(x) = \frac{1}{n h_n} \sum_{i=1}^{n} K\Big( \frac{x - X_i}{h_n} \Big), \tag{13} \]

where $\{h_n : n \ge 1\}$ is a sequence of positive constants satisfying the conditions

\[ h_n \downarrow 0 \quad\text{and}\quad n h_n \uparrow \infty, \quad\text{as } n \to \infty. \]
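Secondly, we define the bootstrapped version of $f_{n,h_n}(\cdot)$ by setting, for all $h_n > 0$ and $x \in \mathbb{R}$,

\[ f_{n,h_n}^*(x) = \frac{1}{h_n} \sum_{i=1}^{n} W_{i;n}\, K\Big( \frac{x - X_i}{h_n} \Big), \tag{14} \]

where $W_{i;n}$ is defined in (4). We will provide an approximation rate for the following process:

\[ \gamma_n^*(x) = \sqrt{n}\, h_n \big( f_{n,h_n}^*(x) - f_{n,h_n}(x) \big), \qquad -\infty < x < \infty. \tag{15} \]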
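A minimal numerical sketch of the kernel estimator, its weighted bootstrap version and the process $\gamma_n^*$ follows. It assumes that the bootstrapped estimator is $h_n^{-1}\sum_i W_{i;n} K((x - X_i)/h_n)$ with $W_{i;n} = Z_i/T_n$, and that the process is normalized by $\sqrt{n}\,h_n$. The Epanechnikov kernel (nonnegative, compactly supported, of bounded variation, integrating to one), the exponential multipliers, the bandwidth and all function names are illustrative choices, not taken from the paper.

```python
import math
import random


def epanechnikov(u):
    """Kernel 0.75 * (1 - u^2) on [-1, 1], zero elsewhere."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def kde(sample, h, x, weights=None):
    """Kernel estimate (1/h) * sum_i w_i * K((x - X_i)/h); w_i = 1/n by default."""
    n = len(sample)
    if weights is None:
        weights = [1.0 / n] * n
    return sum(w * epanechnikov((x - xi) / h) for w, xi in zip(weights, sample)) / h


def gamma_star(sample, z, h, x):
    """sqrt(n) * h * (weighted bootstrap estimate - ordinary estimate) at x."""
    n = len(sample)
    t_n = sum(z)
    w = [zi / t_n for zi in z]
    return math.sqrt(n) * h * (kde(sample, h, x, w) - kde(sample, h, x))


rng = random.Random(3)  # illustrative seed
data = [rng.gauss(0.0, 1.0) for _ in range(300)]  # illustrative data
z = [rng.expovariate(1.0) for _ in data]  # assumed multipliers: Exp(1)
h = 0.4  # illustrative bandwidth
grid = [-2.0 + 0.1 * k for k in range(41)]
gamma_values = [gamma_star(data, z, h, x) for x in grid]
```

Repeating the last two steps with fresh multipliers gives independent bootstrap replicates of $\gamma_n^*$ on the grid.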



The following theorem, proved in the next Section, shows that a single bootstrap suffices to obtain the desired 
approximation for non-parametric kernel-type density estimators. 

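Theorem 3 Let conditions (A1), (A2), (K1) and (K2) prevail. Then we can define Brownian bridges $\{B_n^*(y) : 0 \le y \le 1\}$ such that, almost surely along $X_1, X_2, \ldots$, as $n$ tends to infinity, we have

\[ \sup_{-\infty<x<\infty} \Big| \gamma_n^*(x) - \int K\Big( \frac{x - s}{h_n} \Big)\, dB_n^*(F(s)) \Big| = O_P\Big( \frac{\log n}{\sqrt{n}} \Big). \tag{16} \]

If, moreover, we suppose boundedness of the unknown density $f$, i.e. if we suppose the existence of $M > 0$ such that $\sup_{-\infty<x<\infty} f(x) \le M$, then, almost surely along $X_1, X_2, \ldots$, as $n$ tends to infinity,

\[ \sup_{-\infty<x<\infty} \Big| \gamma_n^*(x) - B_n^*(F(x)) \int K(t)\,dt \Big| = O_P\Big( \frac{\log n}{\sqrt{n}} + \sqrt{h_n \log h_n^{-1}} \Big). \tag{17} \]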



Remark 3. Under appropriate conditions, and using the same arguments rehearsed in the proof of Theorem 3, it is possible to obtain an approximation of a smoothed version of $F_n^*$.

3 Proofs



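Proof of Theorem 1. In the sequel, we will write $\|\cdot\|$ to indicate $\sup_{-\infty<t<+\infty}|\cdot|$. We have that

\[ \|\alpha_n^*(t) - B_n^*(F(t))\| = \big\| \sqrt{n}\,(F_n^*(t) - F_n(t)) - B_n^*(F(t)) \big\|. \]

Now, it is easily seen that

\[ \sqrt{n}\,(F_n^*(t) - F_n(t)) = \frac{\sqrt{n}}{T_n} \Big( \sum_{i=1}^{n} Z_i \mathbb{1}\{X_i \le t\} - F(t) T_n + (F(t) - F_n(t)) T_n \Big), \]

so that

\[ \|\alpha_n^*(t) - B_n^*(F(t))\| \le S_1(n) + S_2(n) + S_3(n), \]

where

\[ S_1(n) := \frac{n}{T_n} \Big\| \frac{1}{\sqrt{n}} \Big( \sum_{i=1}^{n} Z_i \mathbb{1}\{X_i \le t\} - F(t) T_n \Big) + \sqrt{n}\,(F(t) - F_n(t)) - B_n^*(F(t)) \Big\|, \]

where

\[ S_2(n) := \Big| \frac{n}{T_n} - 1 \Big|\, \big\| \sqrt{n}\,(F(t) - F_n(t)) \big\|, \]

and where

\[ S_3(n) := \Big| \frac{n}{T_n} - 1 \Big|\, \|B_n^*(F(t))\|. \]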



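We start by dealing with the term $S_3(n)$. We will treat the cases $x > Cn$ and $x \le Cn$ ($C$ being a strictly positive constant) separately. Fix $x > Cn$ arbitrarily. The union bound gives, for all $n$,

\[ P\big( S_3(n) > n^{-1/2}(x + c \log n) \big) \le P\Big( S_4(n) > \frac{x}{2\sqrt{n}} \Big) + P\Big( \|B_n^*(F(t))\| > \frac{x}{2\sqrt{n}} \Big), \]

where

\[ S_4(n) := \frac{n}{T_n}\, \|B_n^*(F(t))\|. \]

Now, it is known that, for all $n \ge 1$ and all $x > Cn$, there exists a positive constant $c_1$ such that

\[ P\Big( \|B_n^*(F(t))\| > \frac{x}{2\sqrt{n}} \Big) \le c_1 \exp\Big( -\frac{x^2}{2n} \Big) \le \exp\Big( -\frac{x^2}{4n} \Big). \tag{23} \]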


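On the other hand, since the strong law of large numbers gives

\[ \Big| \frac{n}{T_n} - 1 \Big| \to 0 \quad\text{almost surely}, \tag{24} \]

for all $\varepsilon, \eta > 0$, there exists $N_1 = N_1(\varepsilon, \eta)$ such that, for all $n > N_1$,

\[ P\Big( \Big| \frac{n}{T_n} - 1 \Big| \in (0, \varepsilon) \Big) > 1 - \eta. \]

Consequently, denoting the law of $n/T_n$ by $\mathcal{L}_n$, the independence of the $Z_n$'s from the $B_n^*$'s gives

\begin{align*}
P\Big( S_4(n) > \frac{x}{2\sqrt{n}} \Big)
&\le P\Big( S_4(n) > \frac{x}{2\sqrt{n}},\ \Big| \frac{n}{T_n} - 1 \Big| \in (0, \varepsilon) \Big) + P\Big( \Big| \frac{n}{T_n} - 1 \Big| \notin (0, \varepsilon) \Big) \\
&\le \int_{\{y :\, |y - 1| \in (0, \varepsilon)\}} P\Big( \|B_n^*(F(t))\| > \frac{x}{2\sqrt{n}\, y} \Big)\, \mathcal{L}_n(dy) + \eta \\
&\le P\Big( \|B_n^*(F(t))\| > \frac{x}{2\sqrt{n}\,(1 + \varepsilon)} \Big) + \eta \\
&\le c_1 \exp\Big( -\frac{x^2}{4n(1 + \varepsilon)^2} \Big) + \eta, \tag{25}
\end{align*}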

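where, in the last inequality, we have used (23). Combining (23) and (25), we have that, for all $\varepsilon, \eta > 0$, there exists $N_1 = N_1(\varepsilon, \eta)$ such that, for all $n > N_1$,

\[ P\big( S_3(n) > n^{-1/2}(x + c \log n) \big) \le (1 + c_1) \exp\Big( -\frac{x^2}{4n(1 + \varepsilon)^2} \Big) + \eta. \tag{26} \]

Now we turn to the case $0 < x \le Cn$. Again, by the union bound,

\[ P\big( S_3(n) > n^{-1/2}(x + c \log n) \big) \le P\Big( \Big| \frac{n}{T_n} - 1 \Big| > \sqrt{\frac{x}{n}} \Big) + P\big( \|B_n^*(F(t))\| > \sqrt{x} \big). \tag{27} \]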

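Again by (23), we have that, for all $n$,

\[ P\big( \|B_n^*(F(t))\| > \sqrt{x} \big) \le c_1 \exp(-x/2). \tag{28} \]

On the other hand, by (24), for all $\varepsilon, \eta > 0$, there exists $N_1 = N_1(\varepsilon, \eta)$ such that, for all $n > N_1$,

\begin{align*}
P\Big( \Big| \frac{n}{T_n} - 1 \Big| > \sqrt{\frac{x}{n}} \Big)
&\le P\Big( \Big| \frac{n}{T_n} - 1 \Big| > \sqrt{\frac{x}{n}},\ \Big| \frac{n}{T_n} - 1 \Big| \in (0, \varepsilon) \Big) + P\Big( \Big| \frac{n}{T_n} - 1 \Big| \notin (0, \varepsilon) \Big) \\
&\le P\Big( \Big| \frac{n}{T_n} - 1 \Big| > \sqrt{\frac{x}{n}},\ \Big| \frac{n}{T_n} - 1 \Big| \in (0, \varepsilon) \Big) + \eta \\
&\le P\Big( \Big| \frac{T_n}{n} - 1 \Big| > \sqrt{\frac{x}{n(1 + \varepsilon)^2}} \Big) + \eta. \tag{29}
\end{align*}

Use Theorem 2.6 in Petrov (1995) to find constants $c_2$ and $c_3$ such that

\[ P\Big( \Big| \frac{T_n}{n} - 1 \Big| > \sqrt{\frac{x}{n(1 + \varepsilon)^2}} \Big) + \eta \le c_2 \exp\Big( -\frac{c_3\, x}{(1 + \varepsilon)^2} \Big) + \eta. \tag{30} \]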



Combining (28), (29) and (30), and plugging into (27), we deduce the existence of positive universal constants $c_4$ and $c_5$ such that

\[ P\big( S_3(n) > n^{-1/2}(x + c \log n) \big) \le c_4 \exp\Big( -\frac{c_5\, x}{(1 + \varepsilon)^2} \Big) + \eta, \tag{31} \]

so that one concludes, from (26) and (31), that for all $\varepsilon, \eta > 0$, there exists $N = N(\varepsilon, \eta)$ such that, for all $n > N$ and all $x > 0$,

\[ P\big( S_3(n) > n^{-1/2}(x + c \log n) \big) \le c_6 \exp\Big( -\frac{c_7\, x}{(1 + \varepsilon)^2} \Big) + \eta, \tag{32} \]

for some universal constants $c_6$ and $c_7$.

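The proof is concluded once we show the existence of universal positive constants $c_8$, $c_9$, $c_{10}$ and $c_{11}$ such that, for all $\varepsilon, \eta > 0$, there exist $N_2 = N_2(\varepsilon, \eta)$ and $N_3 = N_3(\varepsilon, \eta)$ such that, for all $n > N_2$ and all $x > 0$,

\[ P\big( S_1(n) > n^{-1/2}(x + c \log n) \big) \le c_8 \exp\Big( -\frac{c_9\, x}{(1 + \varepsilon)^2} \Big) + \eta, \tag{33} \]

and, for all $n > N_3$ and all $x > 0$,

\[ P\big( S_2(n) > n^{-1/2}(x + c \log n) \big) \le c_{10} \exp\Big( -\frac{c_{11}\, x}{(1 + \varepsilon)^2} \Big) + \eta. \tag{34} \]

Since

\[ S_1(n) = \frac{n}{T_n} \Big\| \frac{1}{\sqrt{n}} \Big( \sum_{i=1}^{n} Z_i \mathbb{1}\{X_i \le t\} - T_n F(t) \Big) + \sqrt{n}\,(F(t) - F_n(t)) - B_n^*(F(t)) \Big\|, \]

formula (3.7) in Horváth et al. (2000), combined with arguments similar to those used for the term $S_3(n)$, implies (33). As for (34), formula (3.5) in Horváth et al. (2000) together with the by now usual $\varepsilon, \eta$ argument concludes the proof. □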

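Proof of Theorem 3. We start by proving (16). We have, for $x \in \mathbb{R}$,

\[ \sqrt{n}\, h_n \big( f_{n,h_n}^*(x) - f_{n,h_n}(x) \big) = \int K\Big( \frac{x - s}{h_n} \Big)\, d\big\{ \sqrt{n}\,(F_n^*(s) - F_n(s)) \big\} = \int K\Big( \frac{x - s}{h_n} \Big)\, d\alpha_n^*(s). \]

Integration by parts implies that

\[ \int K\Big( \frac{x - s}{h_n} \Big)\, d\alpha_n^*(s) = \int \alpha_n^*(x - t h_n)\, dK(t), \tag{35} \]

and

\[ \int K\Big( \frac{x - s}{h_n} \Big)\, dB_n^*(F(s)) = \int B_n^*(F(x - t h_n))\, dK(t). \tag{36} \]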



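Now, Theorem 1 together with condition (K1) give

\[ \sup_{-\infty<x<\infty} \Big| \int \alpha_n^*(x - t h_n)\, dK(t) - \int B_n^*(F(x - t h_n))\, dK(t) \Big| \le \sup_{-\infty<u<\infty} \big| \alpha_n^*(u) - B_n^*(F(u)) \big| \int d|K(t)| = O_P\Big( \frac{\log n}{\sqrt{n}} \Big), \tag{37} \]

thus proving (16).

Once (16) is at hand, to prove (17), it suffices to bound

\[ \Big| \int B_n^*(F(x - t h_n))\, dK(t) - B_n^*(F(x)) \int dK(t) \Big| \le \int \big| B_n^*(F(x - t h_n)) - B_n^*(F(x)) \big|\, d|K(t)| \tag{38} \]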
in probability. By condition (K1), and provided the unknown density $f$ is bounded (by a strictly positive constant, say $M$), for $n$ large enough,

\[ \big| B_n^*(F(x - t h_n)) - B_n^*(F(x)) \big| \le \sup_{|u - v| \le \delta_n} |B_n^*(u) - B_n^*(v)|, \tag{39} \]

where $\delta_n = M h_n$. Now, it is always possible to define a Brownian bridge, $\{B^*(y) : 0 \le y \le 1\}$, on the same probability space carrying the sequence of Brownian bridges $\{B_n^*(y) : 0 \le y \le 1\}_{n \ge 1}$, such that, for all $n$ and all $\varepsilon > 0$,

\begin{align*}
& P\Big( \{2\delta_n \log \delta_n^{-1}\}^{-1/2} \sup_{h \in [0, \delta_n]}\ \sup_{|u - v| \le h} |B_n^*(u) - B_n^*(v)| > 1 + \varepsilon \Big) \\
&\qquad = P\Big( \{2\delta_n \log \delta_n^{-1}\}^{-1/2} \sup_{h \in [0, \delta_n]}\ \sup_{|u - v| \le h} |B^*(u) - B^*(v)| > 1 + \varepsilon \Big).
\end{align*}



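Since $\delta_n \to 0$, by Theorem 1.4.1 in Csörgő and Révész (1981), we have with probability one

\[ \lim_{n \to \infty} \{2\delta_n \log \delta_n^{-1}\}^{-1/2} \sup_{h \in [0, \delta_n]}\ \sup_{|u - v| \le h} |B^*(u) - B^*(v)| = 1. \tag{40} \]

Thus, as $n \to \infty$,

\[ P\Big( \{2\delta_n \log \delta_n^{-1}\}^{-1/2} \sup_{h \in [0, \delta_n]}\ \sup_{|u - v| \le h} |B_n^*(u) - B_n^*(v)| > 1 + \varepsilon \Big) \to 0, \]

giving

\[ \sup_{h \in [0, \delta_n]}\ \sup_{|u - v| \le h} |B_n^*(u) - B_n^*(v)| = O_P\Big( \sqrt{\delta_n \log \delta_n^{-1}} \Big). \tag{41} \]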

Putting (35) through (41) together, we obtain

\[ \sup_{-\infty<x<\infty} \Big| \gamma_n^*(x) - B_n^*(F(x)) \int dK(t) \Big| = O_P\Big( \frac{\log n}{\sqrt{n}} + \sqrt{h_n \log h_n^{-1}} \Big), \]

thus completing the proof of Theorem 3. □